-
Notifications
You must be signed in to change notification settings - Fork 519
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-9377. Implement some metrics with RW-lock instead of synchronization #7708
base: master
Are you sure you want to change the base?
Conversation
2f66e63
to
907cb96
Compare
Thanks @juncevich for the patch. It seems this removes synchronization in some individual metrics, but not in My 2 cents: I don't think we should reimplement Hadoop metrics system in Ozone with tweaks. Either improve it in Hadoop to benefit the ecosystem, or use a better existing metrics library in Ozone. |
Thanks for the notices. Could you tell me more about the |
Not "existing in Ozone", I meant "existing somewhere else, which we could start using in Ozone".
I consider (1) to be the worst option, both for Ozone and globally. |
I think (2) is the best option, but it takes the longest. I will ask about not synchronizing metrics in the Hadoop project. And will return with the answer. |
I looked at communication in Hadoop(mailing lists) and discussion and realization can take a long time ) |
@juncevich I have a few questions, what is the desired impact of this code change? Do you see a performance difference? How much is it? Answers to those questions will help us decide if it is ok to duplicate code but fix it to gain benefits. In the linked jira you do describe the problem but do not mention how much this PR will help solve the issue. Also, internal improvements sometimes do not lead to externally visible improvements, any details on how it helped speed up the test you'll deviced? |
@kerneltime The desired impact is to get rid of the locks in metrics collecting and improve writing speed. Without a fix, it was a 3000-3200 writing operation (CreateFile or CommitKey). After fixing it was 3600-3700. Improvement of more than 12%. |
If we are to replace the metric implementation, I would go with something like dropwizard or another existing package to reuse after comparing performance.
This is a decent improvement, @adoroszlai should we initiate an upstream discussion with Hadoop while we see if we can merge this change in. I think we need a dual strategy
|
I agree. Please also take a look at Ratis' metrics API, for which Ozone already uses the Dropwizard implementation.
|
69a55f3
to
997884c
Compare
df94f80
to
0ac69a3
Compare
private MutableRate hsyncSynchronizedWorkNs; | ||
private OzoneMutableRate hsyncSynchronizedWorkNs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible, we should keep the original class names. Package name should be enough to distinguish between Hadoop and Ozone class. This would help reduce code change mostly to imports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! Will fix.
* A convenient mutable metric for throughput measurement. | ||
* (non-synchronized version of org.apache.hadoop.metrics2.lib.MutableStat) | ||
*/ | ||
public class OzoneMutableStat extends MutableStat { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it supposed to extend MutableStat
? This duplicates all member variables. Can we copy all necessary classes without subclassing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All metrics should extend MutableMetric. I will change extending MutableStat to MutableMetric.
/** | ||
* Tests for SampleQuantiles. | ||
*/ | ||
class SampleQuantilesTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests named *Test
are not executed:
Line 1844 in 6f82e7b
<include>**/Test*.java</include> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Fixed.
35f0684
to
63b0cd3
Compare
What changes were proposed in this pull request?
Change synchronized metrics collect methods to read/write lock implementation.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-9377
How was this patch tested?
I wrote the JUnit test (I'm not sure I should add this test to this PR) to compare performance metric realization.